Construction and application of the genome-scale metabolic model of Streptomyces radiopugnans

Geosmin, one of the most common earthy-musty odor compounds, is produced mainly by Streptomyces. Streptomyces radiopugnans, which was isolated from radiation-polluted soil, has the potential to overproduce geosmin. However, because of its complex cellular metabolism and regulatory mechanisms, the phenotypes of S. radiopugnans have been hard to investigate. Here, a genome-scale metabolic model of S. radiopugnans, named iZDZ767, was constructed. Model iZDZ767 comprises 1,411 reactions, 1,399 metabolites, and 767 genes, giving a gene coverage of 14.1%. Its growth predictions agreed with experiments for 23 of 28 carbon sources and five of six nitrogen sources, corresponding to prediction accuracies of 82.1% and 83.3%, respectively. For essential gene prediction, the accuracy was 97.6%. According to simulations with model iZDZ767, D-glucose and urea were the best carbon and nitrogen sources for geosmin fermentation. Culture condition optimization experiments confirmed that with D-glucose as the carbon source and urea (4 g/L) as the nitrogen source, geosmin production reached 581.6 ng/L. Using the OptForce algorithm, 29 genes were identified as targets for metabolic engineering. With the help of model iZDZ767, the phenotypes of S. radiopugnans could be well resolved, and the key targets for geosmin overproduction could be identified efficiently.


Introduction
Geosmin (trans-1,10-dimethyl-trans-9-decalol) is an irregular sesquiterpenoid produced by various actinomycetes and fungi, and it has a distinct earthy or musty odor (Jiang et al., 2007). Geosmin is associated with the flavor of drinking water, wine, fish, and other foodstuffs. Geosmin producers include most Streptomyces (Jiang et al., 2006; Becher et al., 2020), several species of cyanobacteria (Jiang et al., 2006; Giglio et al., 2008), myxobacteria (Dickschat et al., 2005), and fungi (Liato and Aider, 2017). Streptomyces radiopugnans belongs to the genus Streptomyces and was isolated from radiation-polluted soil in Xinjiang Province, China (Mao et al., 2007). The genome of S. radiopugnans was sequenced in 2016 and can be accessed from the NCBI database. However, because experimental data are scarce, the regulatory mechanism of geosmin biosynthesis in S. radiopugnans remains unclear.
In this study, a genome-scale metabolic model (GSMM) named iZDZ767 was constructed from the genome sequence of S. radiopugnans. Model iZDZ767 contains 1,411 reactions, 1,399 metabolites, and 767 genes. Compared with experimental data, iZDZ767 achieved 82.1% and 83.3% accuracy in predicting the utilization of different carbon sources and nitrogen sources, respectively. In addition, the prediction accuracy for essential genes was 97.6% when compared with the DEG database (Luo et al., 2021). Based on iZDZ767, D-glucose and urea were then identified as the most suitable carbon source and nitrogen source, respectively. Experiments confirmed that when urea was used as the nitrogen source at 4 g/L, geosmin production reached 581.6 ng/L. Finally, using the OptForce algorithm (Ranganathan et al., 2010), 29 genes (seven upregulation, six downregulation, and 16 knockout targets) were identified as potential targets for improving the geosmin synthesis rate. This study provides new insights that can be used to investigate the phenotype of S. radiopugnans and to identify metabolic engineering targets for geosmin overproduction.

Strain
The S. radiopugnans R97T strain was isolated from a radiation-contaminated area in Xinjiang, China (Mao et al., 2007).

Culture condition
For shake-flask cultivation, the S. radiopugnans strain was first cultured on an agar plate. Then, a single colony was selected and cultured in 50 mL fresh medium until the OD600 reached a value of 0.8. Finally, the strain was transferred into a 500 mL shake flask containing 50 mL fermentation medium and cultivated at 30°C for 240 h with shaking at 200 rpm.

Determining the growth rate and glucose consumption rate
The optical density (OD) was first measured at 600 nm with a spectrophotometer. The cell dry weight was then calculated by multiplying the OD600 by 0.36 g/L (Supplementary Figure S1) (Fischer and Sawers, 2013). The growth curve was fitted using the Logistic function of Origin software. Finally, the cell growth rate was calculated from the differential values of cell dry weight (Supplementary Figure S2). Similarly, the glucose consumption rate was calculated using Origin software, based on the experimental data (Supplementary Figure S3).
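The conversion and differentiation steps described above can be sketched in a few lines of Python. This is only an illustration: it assumes the standard three-parameter logistic growth curve (the Logistic function built into Origin may be parameterized differently), it reports the specific growth rate (dX/dt divided by X), and the parameter values x0, x_max, and r are hypothetical rather than fitted to the data of this study.

```python
import math

CDW_PER_OD600 = 0.36  # g/L of cell dry weight per OD600 unit, as stated in the text


def od_to_cdw(od600):
    """Convert an OD600 reading to cell dry weight (g/L)."""
    return CDW_PER_OD600 * od600


def logistic(t, x0, x_max, r):
    """Logistic growth curve X(t): initial value x0, plateau x_max, rate constant r (1/h)."""
    return x_max / (1.0 + (x_max / x0 - 1.0) * math.exp(-r * t))


def specific_growth_rate(t, x0, x_max, r):
    """mu(t) = (dX/dt) / X, which equals r * (1 - X / x_max) for the logistic curve."""
    x = logistic(t, x0, x_max, r)
    return r * (1.0 - x / x_max)


# Hypothetical fit parameters, for illustration only
x0 = od_to_cdw(0.05)
x_max = od_to_cdw(8.0)
r = 0.14

mu_at_start = specific_growth_rate(0.0, x0, x_max, r)
mu_late = specific_growth_rate(100.0, x0, x_max, r)
```

Under this assumed curve, the specific growth rate is highest at inoculation and falls toward zero as the culture approaches its plateau, so the maximum growth rate is read from the early part of the fitted curve.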

Geosmin extraction and analysis
Geosmin was extracted and analyzed as described by Shen et al. (2021).

Prediction of optimized fermentation conditions
The robustness analysis function of the COBRA Toolbox, [controlFlux, objFlux] = robustnessAnalysis(model, controlRxn, nPoints, plotResFlag, objRxn, objType), was run in MATLAB to simulate the effect of the urea uptake rate on the synthesis rate of geosmin.
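The idea behind a robustness analysis is to fix the control flux at a series of evenly spaced values and to maximize the objective at each one. The Python sketch below illustrates only that sweep. It does not call the COBRA Toolbox; the inner optimization is replaced by the closed-form optimum of a hypothetical toy network, and the names robustness_sweep and toy_max_product, as well as all parameter values, are assumptions made for illustration.

```python
def robustness_sweep(optimize_at, lower, upper, n_points):
    """Fix the control flux at n_points evenly spaced values and record the optimum at each."""
    step = (upper - lower) / (n_points - 1)
    control = [lower + i * step for i in range(n_points)]
    objective = [optimize_at(v) for v in control]
    return control, objective


def toy_max_product(urea_uptake, carbon_budget=10.0, n_per_product=2.0,
                    carbon_cost_per_urea=0.5):
    """Closed-form optimum of a hypothetical toy network (not model iZDZ767).

    Product flux is limited by the nitrogen supplied and by the carbon left
    after paying an assumed carbon cost for assimilating urea.
    """
    nitrogen_limit = urea_uptake / n_per_product
    carbon_limit = carbon_budget - carbon_cost_per_urea * urea_uptake
    return max(0.0, min(nitrogen_limit, carbon_limit))


control_flux, obj_flux = robustness_sweep(toy_max_product, 0.0, 20.0, 21)
best_uptake = control_flux[obj_flux.index(max(obj_flux))]
```

In this toy case the objective rises while nitrogen is limiting and falls once the assumed carbon cost dominates, so the sweep locates a single best uptake rate. With a genome-scale model, each call to optimize_at would instead solve a flux balance problem.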

Hardware and software used for model construction and analysis
Detailed information is listed in Supplementary Table S1.

Model construction and characteristics analysis
Frontiers in Bioengineering and Biotechnology frontiersin.org
The genome-scale metabolic model of S. radiopugnans was constructed in several steps. First, ModelSEED (Henry et al., 2010) and CarveME (Machado et al., 2018) were used to construct the draft model of S. radiopugnans. Then, based on the KAAS annotation results (Moriya et al., 2007), gaps were filled manually by referring to KEGG pathway maps. In addition, the biomass composition of S. radiopugnans, which includes proteins, RNA, DNA, lipids, cell wall, and small molecules, was determined through literature mining (Supplementary Material S1). Finally, the curated model was converted into its mathematical form with COBRA Toolbox 3.0 (Heirendt et al., 2019). The resulting model of S. radiopugnans consists of 1,411 reactions, 1,399 metabolites, and 767 genes, and was named iZDZ767 (Table 1; Supplementary Material S1). Of these reactions, 88.0% are gene associated. According to the KEGG pathway maps, the reactions can be classified into 10 subsystems. Lipid metabolism, carbohydrate metabolism, and amino acid metabolism are the largest, accounting for 25.3%, 18.1%, and 16.3% of the reactions, respectively (Figure 1). The gene coverage of model iZDZ767 is 14.1%, which is close to that of iAA1259, the newest model of Streptomyces coelicolor (15.1%) (Amara et al., 2018).
Cytoscape software was used to analyze the network characteristics of model iZDZ767. There were 3,577 nodes and 8,706 edges in model iZDZ767.
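Converting a reconstruction into mathematical form means writing it as a stoichiometric matrix S, with metabolites as rows and reactions as columns, so that a steady-state flux vector v satisfies S·v = 0. The following Python sketch shows this for a hypothetical three-reaction toy network. The network, the flux values, and the helper names are assumptions for illustration and are unrelated to the actual content of iZDZ767.

```python
# Hypothetical toy network: A_ext -> A -> B -> B_ext
# Rows: internal metabolites A and B
# Columns: reactions uptake, conversion, secretion
S = [
    [1, -1, 0],   # A: produced by uptake, consumed by conversion
    [0, 1, -1],   # B: produced by conversion, consumed by secretion
]


def mass_balance(S, v):
    """Return S.v, the net production rate of each internal metabolite."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


def is_steady_state(S, v, tol=1e-9):
    """True when no internal metabolite accumulates or is depleted."""
    return all(abs(x) <= tol for x in mass_balance(S, v))


balanced = is_steady_state(S, [2.0, 2.0, 2.0])      # every flux equal: no accumulation
unbalanced = is_steady_state(S, [2.0, 1.0, 1.0])    # A accumulates at a rate of 1
```

Flux balance analysis then searches among all flux vectors that pass this steady-state check, and that respect the flux bounds, for the one that maximizes a chosen objective such as biomass or product formation.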

Model verification
Using a minimal culture medium, model iZDZ767 was used to simulate whether 28 carbon sources and six nitrogen sources could be utilized. For the carbon sources, the predictions of model iZDZ767 agreed with the experimental results in 82.1% of cases (23/28). For the nitrogen sources, the agreement was 83.3% (5/6); the mismatch involved L-cysteine as the sole nitrogen source, for which no growth was obtained (Table 2). In addition, the simulated maximum growth rate (μmax) was 0.131 h−1, which was only 4.4% lower than the measured growth rate (0.137 h−1, Supplementary Figure S2). The essentiality of individual genes of S. radiopugnans was analyzed under minimal glucose medium conditions using iZDZ767 by deleting each gene in turn. The genes were categorized into three classes: essential genes, partially essential genes, and non-essential genes. In total, 84 genes were identified as essential. These genes were further compared with the DEG database (Luo et al., 2021), and 97.6% of the predicted essential genes could be matched by sequence BLAST (identity ≥30%, e-value ≤ 1e-6; Supplementary Material S2). These results indicate that model iZDZ767 can predict the phenotype of S. radiopugnans well.

FIGURE 1
The distribution of reactions among the metabolic subsystems in model iZDZ767.
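The verification percentages reported in this section follow from simple ratios, which the Python snippet below reproduces. The count of matched essential genes is not stated in the text; the value computed here is inferred from the reported 97.6% of 84 genes and should be read as an assumption.

```python
def percent(part, whole):
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


carbon_accuracy = percent(23, 28)                  # carbon sources predicted correctly
nitrogen_accuracy = percent(5, 6)                  # nitrogen sources predicted correctly
growth_deviation = percent(0.137 - 0.131, 0.137)   # simulated vs measured growth rate

# Inferred, not stated in the text: number of essential genes matched in DEG
matched_essential = round(0.976 * 84)
matched_percent = percent(matched_essential, 84)
```

Rounded to one decimal place, these values reproduce the 82.1%, 83.3%, 4.4%, and 97.6% quoted above.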

The optimization of culture condition with iZDZ767
The carbon source is a key factor for cell growth and product synthesis. Using model iZDZ767, the effect of different carbon sources (D-glucose, D-fructose, mannose, L-rhamnose, and D-xylose) on the geosmin production rate (GPR) was predicted. Of these carbon sources, D-glucose was the most suitable, with a GPR of 2.04 mmol/gDW/h (Figure 2A). The experimental results showed that when D-glucose was used as the carbon source, the yield of geosmin was 317.5 ng/L, which was higher than with the other carbon sources (Figure 2B). Similarly, three types of nitrogen sources were simulated. The model predicted that when urea was used as the nitrogen source, the GPR was 3.03 mmol/gDW/h, which was 48.9% and 50.7% higher than with NH4+ and NO3−, respectively (Figure 3A). In the corresponding experiments, the geosmin yield was 435.1 ng/L, which agreed with the simulation (Figure 3B). In addition, robustness analysis was used to examine the effect of the urea uptake rate on the GPR. The simulation showed that as the urea uptake rate increased, the GPR first rose to its maximum value, then remained stable until the urea uptake rate exceeded 500 mmol/gDW/h, and finally decreased to 0. On this basis, a urea uptake rate of 88.38 mmol/gDW/h was considered suitable (Figure 3C). When urea was added at different levels, the experimental results showed that 4 g/L urea gave the maximum geosmin production of 581.6 ng/L (Figure 3D).
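The relative figures in this section can be checked with a short calculation. The Python sketch below assumes that "x% higher" means (urea − baseline) / baseline, that the three reported yields can be compared directly as successive optimization steps, and that urea has a molar mass of 60.06 g/mol. The baseline GPR values for NH4+ and NO3− are back-calculated here and are not reported in the text.

```python
UREA_MOLAR_MASS = 60.06  # g/mol for CH4N2O


def implied_baseline(value, percent_higher):
    """Baseline b such that value is percent_higher % above b."""
    return value / (1.0 + percent_higher / 100.0)


def fold_change(new, old):
    """Ratio of a new value to a reference value."""
    return new / old


# Back-calculated GPR values (mmol/gDW/h); assumptions, not reported data
gpr_ammonium = implied_baseline(3.03, 48.9)
gpr_nitrate = implied_baseline(3.03, 50.7)

# 4 g/L urea expressed as a molar concentration (mmol/L)
urea_mmol_per_l = 4.0 / UREA_MOLAR_MASS * 1000.0

# Geosmin yield relative to the D-glucose result of 317.5 ng/L
gain_with_urea = fold_change(435.1, 317.5)
gain_with_optimized_urea = fold_change(581.6, 317.5)
```

Under these assumptions, the optimized urea level corresponds to roughly 67 mmol/L and to a geosmin yield about 1.8 times that of the D-glucose experiment.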

Identification of the potential geosmin overproduction targets with iZDZ767
To identify potential targets for improving geosmin production, the OptForce algorithm was used (Ranganathan et al., 2010). According to the predictions, a total of 29 genes were identified as targets, including seven upregulation, six downregulation, and 16 knockout targets (Supplementary Material S3). According to the function of each gene, these targets can be classified into four types: precursor accumulation, geosmin biosynthesis, by-product elimination, and energy supply. For the geosmin synthesis pathway, the geoA gene, which encodes the geosmin synthase that catalyzes the synthesis of geosmin from farnesyl diphosphate, should be upregulated (Shen et al., 2021). For by-product elimination, the acnA gene (aconitate hydratase A) should be downregulated to decrease the carbon flux through the TCA cycle so that more geosmin can accumulate (Figure 4). Similarly, the fabD gene ([acyl-carrier-protein] S-malonyltransferase) should be downregulated to limit the flux of fatty acid synthesis. For energy supply, the nuo gene (NADH-quinone oxidoreductase) was predicted as a knockout target so that more NADH could be supplied for geosmin synthesis.
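For downstream strain design it can help to keep the predicted interventions in a structured form. The Python sketch below records only the counts and the four genes named in this section; the full list is in Supplementary Material S3 and is not reproduced here. The field names are assumptions, and the category assigned to fabD is inferred from the wording of the text rather than stated explicitly.

```python
from collections import Counter

# Numbers of targets per intervention type, as reported in the text
TARGET_COUNTS = {"upregulation": 7, "downregulation": 6, "knockout": 16}

# Only the targets named in the text; field names are illustrative
NAMED_TARGETS = [
    {"gene": "geoA", "action": "upregulation", "category": "geosmin biosynthesis"},
    {"gene": "acnA", "action": "downregulation", "category": "by-product elimination"},
    # Category for fabD is inferred from context ("Similarly, ...")
    {"gene": "fabD", "action": "downregulation", "category": "by-product elimination"},
    {"gene": "nuo", "action": "knockout", "category": "energy supply"},
]

total_targets = sum(TARGET_COUNTS.values())
named_by_action = Counter(target["action"] for target in NAMED_TARGETS)
```

The total confirms that the three intervention types add up to the 29 targets reported above.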

Discussion
Geosmin is widely recognized by the public as a common pollutant, but studies of some biological systems and organisms present a different picture. Toxicological studies have shown that geosmin at certain concentrations can inhibit the growth of Salmonella typhimurium and sea urchin embryos (Dionigi et al., 1993; Nakajima et al., 1996). This provides a new direction for studying how to inhibit pathogens. Researchers have also found that geosmin has potential genotoxic effects; however, it is only mildly toxic, and only at extremely high concentrations that far exceed actual environmental levels (Silva et al., 2015). Some researchers have found that geosmin, at a concentration of 50-5000 ng/L, can increase the body length of zebrafish and change the expression of their growth-related genes (Zhou et al., 2020).
S. radiopugnans can be isolated from radiation-contaminated soil and, although not widely studied, is capable of producing geosmin in large quantities. However, the phenotypes of S. radiopugnans are difficult to study because of its complex cellular metabolism and regulatory mechanisms. Therefore, by manually refining the first genome-scale metabolic network model of S. radiopugnans (iZDZ767), we analyzed the synthesis mechanism of geosmin and identified the key targets for geosmin synthesis based on the model, which provides a basis for further study of geosmin synthesis and the internal mechanisms of S. radiopugnans.

Characteristics In
L-phenylalanine + + (Santhanam et al., 2012)
L-Threonine + + (Santhanam et al., 2012)

Traditional model construction methods are mainly divided into automatic construction and manual construction. Automatic construction obtains the GSMM of the target strain or plant by uploading the genomic data to existing tools. To date, researchers have developed many tools for the automatic construction of models, such as ModelSEED (Flowers et al., 2018; Seaver et al., 2021), COBRA (Dal'molin et al., 2014; King et al., 2015), and RAVEN (Wang et al., 2018). The advantage of this approach is that a model can be built in a short time, but the accuracy of the model is low and its applicability is limited. Manual construction relies mainly on the results of genome annotation combined with the metabolic pathways in KEGG: information such as the genes and metabolites of each reaction is collated and added to the model by hand. Considering the problems of these traditional methods, we used a semi-automatic modeling method that integrates the two: a coarse model is first obtained with automated construction tools and is then refined manually so that a more accurate model can be obtained. Starting from the automatic reconstructions produced with the ModelSEED database and CarveME (Machado et al., 2018), and drawing on the complete metabolic pathways in KEGG, we added the missing reactions and the known metabolic pathway of geosmin to the model and thus manually refined a genome-scale metabolic network model of S. radiopugnans. Based on model iZDZ767, a series of strategies for improving geosmin production were proposed. Although S. radiopugnans can produce geosmin well by fermentation, its metabolic network is complex and the fermentation period is long. Increasing geosmin production by repeated experiments alone is therefore slow, and the economic cost is high.
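The semi-automatic workflow discussed in this section can be pictured as set operations on reaction lists: automated drafts are combined, and the result is compared against a curated reference pathway to find reactions that still need to be added by hand. The Python sketch below is a loose illustration of that idea. The text does not state how the ModelSEED and CarveME drafts were combined, so taking their union is an assumption, and all reaction identifiers are hypothetical.

```python
def merge_drafts(draft_a, draft_b):
    """Combine two draft reaction sets (assumed here to be a simple union)."""
    return set(draft_a) | set(draft_b)


def find_gaps(model_reactions, reference_pathway):
    """Reactions of a curated reference pathway that are missing from the model."""
    return sorted(set(reference_pathway) - set(model_reactions))


# Hypothetical reaction identifiers, for illustration only
modelseed_draft = {"R_glc_uptake", "R_glycolysis", "R_fpp_synthase"}
carveme_draft = {"R_glc_uptake", "R_tca", "R_fpp_synthase"}
geosmin_pathway = ["R_fpp_synthase", "R_geosmin_synthase"]

merged = merge_drafts(modelseed_draft, carveme_draft)
gaps = find_gaps(merged, geosmin_pathway)

# Manual refinement step: add the missing reactions after curation
curated = merged | set(gaps)
```

In practice, each gap would be checked against the genome annotation and the literature before being added, which is the manual part of the procedure.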
Model iZDZ767 can predict the effects of carbon and nitrogen sources on the synthesis rate of geosmin well and can guide the optimization of culture conditions for S. radiopugnans. The model was also analyzed with the OptForce algorithm to predict the key targets for improving the geosmin synthesis rate. Through the analysis of these key targets, we found that the key reactions affecting the synthesis of geosmin fall mainly into two types. One type is related to the synthesis of geosmin itself, while the other is related to the growth and reproduction of S. radiopugnans. Whether a given target reaction is to be upregulated, downregulated, or knocked out, the modification is predicted to improve the synthesis of geosmin. Although the effectiveness of these interventions has not yet been proven experimentally, they provide a starting point for studying geosmin production by fermentation. In summary, model iZDZ767 is a powerful tool for analyzing and predicting the metabolic pathways and yields of various products in S. radiopugnans, and it facilitates the study of S. radiopugnans in the field of systems biology.

Data availability statement
Publicly available datasets were analyzed in this study. These data can be found here: https://www.uniprot.org/taxonomy/403935.

Author contributions
ZZ constructed the model and wrote this manuscript. QG analyzed data and drew figures. JQ answered the reviewers' questions. CY and HH proof-read the manuscript. All authors have read and approved the final manuscript.

Funding
This work was supported by the National Natural Science Foundation of China (32060004), the Xinjiang Academy of Agricultural Sciences Science and technology innovation key cultivation project (xjkcpy-2021002, xjkcpy-2022004), the Third Xinjiang Scientific Expedition Program (2022xjkk1200), and the "Outstanding Youth Fund" of Xinjiang Natural Science Foundation (2022D01E19).

FIGURE 4
Effect of the TCA cycle on the synthesis of geosmin.